Smart Paper Notes

GAE-ISumm: Unsupervised Graph-Based Summarization of Indian Languages

Lakshmi Sireesha Vakada, Anudeep Ch, Mounika Marreddy, Subba Reddy Oota, Radhika Mamidi

Categories: Natural Language Processing | Machine Learning

2022-12-25

Document summarization aims to create a precise and coherent summary of a text document. Many deep learning summarization models are developed mainly for English, and they often require a large training corpus and efficient pre-trained language models and tools. However, applying English summarization models to low-resource Indian languages is often limited by rich morphological variation and by syntactic and semantic differences. In this paper, we propose GAE-ISumm, an unsupervised Indic summarization model that extracts summaries from text documents. In particular, GAE-ISumm uses a Graph Autoencoder (GAE) to learn text representations and a document summary jointly. We also provide TELSUM, a manually annotated Telugu summarization dataset, to experiment with GAE-ISumm. Further, we experiment with most of the publicly available Indian language summarization datasets to investigate the effectiveness of GAE-ISumm on other Indian languages. Our experiments with GAE-ISumm on seven languages yield the following observations: (i) it is competitive with or better than state-of-the-art results on all datasets, (ii) it reports benchmark results on TELSUM, and (iii) including positional and cluster information in the proposed model improves summarization performance.
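To make the graph autoencoder idea in the abstract more concrete, the following is a minimal sketch and not the authors' implementation. It assumes bag-of-words sentence features, a cosine-similarity threshold for edges, a single GCN-style encoder layer, and an inner-product decoder; the abstract specifies none of these. The function names `sentence_graph`, `normalize_adjacency`, and `gae_forward` are hypothetical, and the weights are random rather than trained, so the code shows only the forward pass. The positional and cluster information mentioned in the abstract is not modelled here.

```python
import numpy as np


def sentence_graph(sentences, threshold=0.1):
    """Build bag-of-words features X and a 0/1 adjacency matrix A.

    Assumption: two sentences are linked when the cosine similarity
    of their bag-of-words vectors is at least `threshold`.
    """
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    X = np.zeros((len(sentences), len(vocab)))
    for row, s in enumerate(sentences):
        for w in s.lower().split():
            X[row, index[w]] += 1.0
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0          # guard against empty sentences
    Xn = X / norms
    A = ((Xn @ Xn.T) >= threshold).astype(float)
    np.fill_diagonal(A, 0.0)         # self-loops are added during normalization
    return X, A


def normalize_adjacency(A):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gae_forward(X, A, hidden_dim=4, seed=0):
    """One untrained forward pass: GCN-style encoder, inner-product decoder.

    Z holds one embedding per sentence; A_rec holds reconstructed edge
    probabilities. Training would fit W so that A_rec approximates A.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(X.shape[1], hidden_dim))
    Z = np.maximum(normalize_adjacency(A) @ X @ W, 0.0)   # ReLU
    A_rec = 1.0 / (1.0 + np.exp(-(Z @ Z.T)))              # sigmoid
    return Z, A_rec


if __name__ == "__main__":
    docs = ["the cat sat", "the cat slept", "stocks fell sharply"]
    X, A = sentence_graph(docs)
    Z, A_rec = gae_forward(X, A)
    print(A)
    print(Z.shape, A_rec.shape)
```

In a trained model, the encoder weights would be optimized against a reconstruction loss between `A_rec` and `A`, and the resulting sentence embeddings `Z` would then feed a sentence-selection step. How GAE-ISumm couples those two steps is not described in the abstract.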


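In recent years, social media has grown into a primary source of information for many online users. This has given rise to the spread of misinformation through deepfakes. Deepfakes are videos or images that replace one person's face with another, computer-generated face, often that of a more widely recognized person. With recent advances in technology, people with little technical experience can produce these videos. This enables them to impersonate figures of power in society, such as presidents or celebrities, creating the potential danger of spreading misinformation and of other nefarious uses of deepfakes. To combat this online threat, researchers have developed models designed to detect deepfakes. This study looks at various deepfake detection models that use deep learning algorithms to counter this looming threat. The survey focuses on providing a comprehensive overview of the current state of deepfake detection models, as well as the distinct approaches many researchers have taken to solve this problem. The benefits, limitations, and suggestions for future work are discussed thoroughly in this paper.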
